Loop Transformations for Performance and Message Latency Hiding in Parallel Object-oriented Frameworks (extended Abstract)

نویسندگان

  • Federico Bassetti
  • Kei Davis
  • Dan Quinlan
چکیده

Application codes reliably achieve performance far less than the advertised capabilities of existing architectures, and this problem is worsening with increasingly-parallel machines. For large-scale numerical applications, stencil operations are often impose the greater part of the computational cost, and the primary sources of ineeciency are the costs of message passing and poor cache utilization. This paper proposes and demonstrates optimizations for stencil and stencil-like computations for both serial and parallel environments that ameliorate these sources of ineeciency. Additionally, we argue that when stencil-like computations are encoded at a high level using object-oriented parallel array class libraries these optimizations, which are beyond the capability of compilers, may be automated.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Linear and Extended Linear Transformations for Shared-Memory Multiprocessors

Advances in program transformation frameworks have signi"cantly advanced compiler technology over the past few years. Program transformation frameworks provide mathematical abstractions of loop and data structures and formal methods for manipulating these structures. It is these frameworks that have allowed the development of algorithms capable of automatically tailoring an application for a ta...

متن کامل

A New Communication and Computation Overlapping Model with Loop Sub-Partitioning and Dynamic Scheduling

The latency hiding techniques can significantly improve the performance of the parallel programs in distributed memory systems. This paper presents a communication and computation overlapping model to hide the communication latency in data parallel programs. The communication and computation overlapping model makes use of the loop subpartitioning scheme in which a given loop partition is partit...

متن کامل

A Technique for Documenting the Framework of an Object-Oriented System

This paper presents techniques for documenting the design of frameworks for object-oriented qystems and applies the approach to the design of a configurable message passing system. The technique decomposes a framework into six concerns: the class hierarchy, protocols, control flow, synchronization, entity relationships and configurations of the system. An abstract description of each concern is...

متن کامل

Optimizing Transformations of Stencil Operations for Parallel Object-Oriented Scientific Frameworks on Cache-Based Architectures

High-performance scientific computing relies increasingly on high-level large-scale object-oriented software frameworks to manage both algorithmic complexity and the complexities of parallelism: distributed data management, process management, inter-process communication, and load balancing. This encapsulation of data management, together with the prescribed semantics of a typical fundamental c...

متن کامل

Latency Hiding in Parallel Systems: A Quantitative Approach

In many parallel applications, network latency causes a dramatic loss in processor utilization. This paper examines software pipelining as a technique for network latency hiding. It quanti es the potential improvements with detailed, instruction-level simulations. The benchmarks used are the Livermore Loop kernels and BLAS Level 1. These were parallelized and run on the instruction-level RISC s...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998